Search | BVS Regional Portal

The Scottish Medical Imaging Archive: 57.3 Million Radiology Studies Linked to Their Medical Records.

Baxter, Rob; Nind, Thomas; Sutherland, James; McAllister, Gordon; Hardy, Douglas; Hume, Ally; MacLeod, Ruairidh; Caldwell, Jacqueline; Krueger, Susan; Tramma, Leandro; Teviotdale, Ross; Gillen, Kenny; Scobbie, Donald; Baillie, Ian; Brooks, Andrew; Prodan, Bianca; Kerr, William; Sloan-Murphy, Dominic; Herrera, Juan F R; van Beek, Edwin J R; Reel, Parminder Singh; Reel, Smarti; Mansouri-Benssassi, Esma; Mudie, Roy; Steele, Douglas; Doney, Alex; Trucco, Emanuele; Morris, Carole; Wallace, Robert; Morris, Andrew; Parsons, Mark; Jefferson, Emily.

Radiol Artif Intell ; 6(1): e220266, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38166330

ABSTRACT

Keywords: MRI, Imaging Sequences, Ultrasound, Mammography, CT, Angiography, Conventional Radiography. Published under a CC BY 4.0 license. See also the commentary by Whitman and Vining in this issue.

Subjects

Mammography , Radiology , Radiography , Medical Records , Scotland

Machine learning for classification of hypertension subtypes using multi-omics: A multi-centre, retrospective, data-driven study.

Reel, Parminder S; Reel, Smarti; van Kralingen, Josie C; Langton, Katharina; Lang, Katharina; Erlic, Zoran; Larsen, Casper K; Amar, Laurence; Pamporaki, Christina; Mulatero, Paolo; Blanchard, Anne; Kabat, Marek; Robertson, Stacy; MacKenzie, Scott M; Taylor, Angela E; Peitzsch, Mirko; Ceccato, Filippo; Scaroni, Carla; Reincke, Martin; Kroiss, Matthias; Dennedy, Michael C; Pecori, Alessio; Monticone, Silvia; Deinum, Jaap; Rossi, Gian Paolo; Lenzini, Livia; McClure, John D; Nind, Thomas; Riddell, Alexandra; Stell, Anthony; Cole, Christian; Sudano, Isabella; Prehn, Cornelia; Adamski, Jerzy; Gimenez-Roqueplo, Anne-Paule; Assié, Guillaume; Arlt, Wiebke; Beuschlein, Felix; Eisenhofer, Graeme; Davies, Eleanor; Zennaro, Maria-Christina; Jefferson, Emily.

EBioMedicine ; 84: 104276, 2022 Oct.

Article in English | MEDLINE | ID: mdl-36179553

ABSTRACT

BACKGROUND: Arterial hypertension is a major cardiovascular risk factor. Identification of secondary hypertension in its various forms is key to preventing and targeting treatment of cardiovascular complications. Simplified diagnostic tests are urgently required to distinguish primary and secondary hypertension to address the current underdiagnosis of the latter. METHODS: This study uses Machine Learning (ML) to classify subtypes of endocrine hypertension (EHT) in a large cohort of hypertensive patients using multidimensional omics analysis of plasma and urine samples. We measured 409 multi-omics (MOmics) features including plasma miRNAs (PmiRNA: 173), plasma catechol O-methylated metabolites (PMetas: 4), plasma steroids (PSteroids: 16), urinary steroid metabolites (USteroids: 27), and plasma small metabolites (PSmallMB: 189) in primary hypertension (PHT) patients, EHT patients with either primary aldosteronism (PA), pheochromocytoma/functional paraganglioma (PPGL) or Cushing syndrome (CS) and normotensive volunteers (NV). Biomarker discovery involved selection of disease combination, outlier handling, feature reduction, 8 ML classifiers, class balancing and consideration of different age- and sex-based scenarios. Classifications were evaluated using balanced accuracy, sensitivity, specificity, AUC, F1, and Kappa score. FINDINGS: Complete clinical and biological datasets were generated from 307 subjects (PA=113, PPGL=88, CS=41 and PHT=112). The random forest classifier provided ~92% balanced accuracy (~11% improvement on the best mono-omics classifier), with 96% specificity and 0.95 AUC to distinguish one of the four conditions in multi-class ALL-ALL comparisons (PPGL vs PA vs CS vs PHT) on an unseen test set, using 57 MOmics features. For discrimination of EHT (PA + PPGL + CS) vs PHT, the simple logistic classifier achieved 0.96 AUC with 90% sensitivity, and ~86% specificity, using 37 MOmics features. One PmiRNA (hsa-miR-15a-5p) and two PSmallMB (C9 and PC ae C38:1) features were found to be most discriminating for all disease combinations. Overall, the MOmics-based classifiers were able to provide better classification performance in comparison to mono-omics classifiers. INTERPRETATION: We have developed a ML pipeline to distinguish different EHT subtypes from PHT using multi-omics data. This innovative approach to stratification is an advancement towards the development of a diagnostic tool for EHT patients, significantly increasing testing throughput and accelerating administration of appropriate treatment. FUNDING: European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 633983, Clinical Research Priority Program of the University of Zurich for the CRPP HYRENE (to Z.E. and F.B.), and Deutsche Forschungsgemeinschaft (CRC/Transregio 205/1).

Subjects

Hypertension , MicroRNAs , Biomarkers , Catechols , Humans , Hypertension/diagnosis , Machine Learning , Retrospective Studies
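
As a rough illustration of several of the evaluation metrics named in the abstract above (sensitivity, specificity, balanced accuracy, F1), the following sketch computes them for a two-class problem from confusion-matrix counts. The function name and the example counts are hypothetical, chosen only for illustration; they are not taken from the study, and this is not the authors' pipeline.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. All values here are illustrative assumptions.
    """
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    balanced_accuracy = (sensitivity + specificity) / 2
    precision = tp / (tp + fp)            # also called PPV
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
    }


# Hypothetical counts: 100 positive and 100 negative test cases.
example = binary_metrics(tp=90, fp=14, tn=86, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy averages the per-class recalls, so it stays informative when class sizes differ, which is presumably why it is reported alongside class balancing in the abstract.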

An extensible big data software architecture managing a research resource of real-world clinical radiology data linked to other health data from the whole Scottish population.

Nind, Thomas; Sutherland, James; McAllister, Gordon; Hardy, Douglas; Hume, Ally; MacLeod, Ruairidh; Caldwell, Jacqueline; Krueger, Susan; Tramma, Leandro; Teviotdale, Ross; Abdelatif, Mohammed; Gillen, Kenny; Ward, Joe; Scobbie, Donald; Baillie, Ian; Brooks, Andrew; Prodan, Bianca; Kerr, William; Sloan-Murphy, Dominic; Herrera, Juan F R; McManus, Dan; Morris, Carole; Sinclair, Carol; Baxter, Rob; Parsons, Mark; Morris, Andrew; Jefferson, Emily.

Gigascience ; 9(10), 2020 Sep 29.

Article in English | MEDLINE | ID: mdl-32990744

ABSTRACT

AIM: To enable a world-leading research dataset of routinely collected clinical images linked to other routinely collected data from the whole Scottish national population. This includes more than 30 million different radiological examinations from a population of 5.4 million and >2 PB of data collected since 2010. METHODS: Scotland has a central archive of radiological data used to directly provide clinical care to patients. We have developed an architecture and platform to securely extract a copy of those data, link it to other clinical or social datasets, remove personal data to protect privacy, and make the resulting data available to researchers in a controlled Safe Haven environment. RESULTS: An extensive software platform has been developed to host, extract, and link data from cohorts to answer research questions. The platform has been tested on 5 different test cases and is currently being further enhanced to support 3 exemplar research projects. CONCLUSIONS: The data available are from a range of radiological modalities and scanner types and were collected under different environmental conditions. These real-world, heterogeneous data are valuable for training algorithms to support clinical decision making, especially for deep learning where large data volumes are required. The resource is now available for international research access. The platform and data can support new health research using artificial intelligence and machine learning technologies, as well as enabling discovery science.

Subjects

Big Data , Radiology , Artificial Intelligence , Humans , Scotland , Software

Knowledge Driven Phenotyping.

Wu, Honghan; Wang, Minhong; Zeng, Qianyi; Chen, Wenjun; Nind, Thomas; Jefferson, Emily; Bennie, Marion; Black, Corri; Pan, Jeff Z; Sudlow, Cathie; Robertson, Dave.

Stud Health Technol Inform ; 270: 1327-1328, 2020 Jun 16.

Article in English | MEDLINE | ID: mdl-32570642

ABSTRACT

Extracting patient phenotypes from routinely collected health data (such as Electronic Health Records) requires translating clinically-sound phenotype definitions into queries/computations executable on the underlying data sources by clinical researchers. This requires significant knowledge and skills to deal with heterogeneous and often imperfect data. Translations are time-consuming, error-prone and, most importantly, hard to share and reproduce across different settings. This paper proposes a knowledge driven framework that (1) decouples the specification of phenotype semantics from underlying data sources; (2) can automatically populate and conduct phenotype computations on heterogeneous data spaces. We report preliminary results of deploying this framework on five Scottish health datasets.

Subjects

Electronic Health Records , Information Storage and Retrieval , Semantics

Identifying care-home residents in routine healthcare datasets: a diagnostic test accuracy study of five methods.

Burton, Jennifer K; Marwick, Charis A; Galloway, James; Hall, Christopher; Nind, Thomas; Reynish, Emma L; Guthrie, Bruce.

Age Ageing ; 48(1): 114-121, 2019 Jan 01.

Article in English | MEDLINE | ID: mdl-30124764

ABSTRACT

Background: there is no established method to identify care-home residents in routine healthcare datasets. Methods matching patients' addresses to known care-home addresses have been proposed in the UK, but few have been formally evaluated. Study design: prospective diagnostic test accuracy study. Methods: four independent samples of 5,000 addresses from Community Health Index (CHI) population registers were drawn for two NHS Scotland Health Boards on 1 April 2017, with one sample of adults aged ≥65 years and one of all residents for each Board. To derive the reference standard, all 20,000 addresses were manually adjudicated as 'care-home address' or not. The performance of five methods (NHS Scotland assigned CHI Institution Flag, exact address matching, postcode matching, Phonics and Markov) was evaluated compared to the reference standard. Results: the CHI Institution Flag had a high PPV (97-99%) in all four test sets, but poorer sensitivity (55-89%). Exact address matching failed in every case. Postcode matching had higher sensitivity than the CHI flag (78-90%), but worse PPV (77-85%). Area under the receiver operating characteristic curve values for Phonics and Markov scores were 0.86-0.95 and 0.93-0.98, respectively. Phonics score with cut-off ≥13 had PPV 92-97% with sensitivity 72-87%. Markov PPVs were 90-95% with sensitivity 69-90% with cut-off ≥29.6. Conclusions: more complex address matching methods greatly improve identification compared to the existing NHS Scotland flag or postcode matching, although no method achieved both sensitivity and positive predictive value > 95%. Choice of method and cut-offs will be determined by the specific needs of researchers and practitioners.

Subjects

Datasets as Topic , Homes for the Aged/statistics & numerical data , Nursing Homes/statistics & numerical data , Aged , Datasets as Topic/statistics & numerical data , Female , Humans , Male , Prospective Studies , Reproducibility of Results , Scotland
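
The trade-off between PPV and sensitivity at a score cut-off, as reported for the Phonics and Markov scores in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the scores and the reference labels are all hypothetical, and the sketch does not implement either scoring method, only the thresholding step applied to any numeric match score.

```python
def ppv_and_sensitivity(scored, cutoff):
    """Return (PPV, sensitivity) when addresses scoring >= cutoff are flagged.

    `scored` is a list of (score, is_care_home) pairs, where is_care_home is
    the manually adjudicated reference standard (hypothetical data here).
    """
    tp = sum(1 for score, truth in scored if score >= cutoff and truth)
    fp = sum(1 for score, truth in scored if score >= cutoff and not truth)
    fn = sum(1 for score, truth in scored if score < cutoff and truth)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Hypothetical match scores with reference-standard labels.
addresses = [(15, True), (14, True), (13, False), (12, True), (5, False), (3, False)]

# Raising the cut-off removes false positives but can also miss true matches.
for cutoff in (12, 13, 14):
    ppv, sens = ppv_and_sensitivity(addresses, cutoff)
    print(f"cut-off >= {cutoff}: PPV={ppv:.2f}, sensitivity={sens:.2f}")
```

Sweeping the cut-off in this way is one plausible route to the choice the authors leave to "the specific needs of researchers and practitioners".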

The research data management platform (RDMP): A novel, process driven, open-source tool for the management of longitudinal cohorts of clinical data.

Nind, Thomas; Galloway, James; McAllister, Gordon; Scobbie, Donald; Bonney, Wilfred; Hall, Christopher; Tramma, Leandro; Reel, Parminder; Groves, Martin; Appleby, Philip; Doney, Alex; Guthrie, Bruce; Jefferson, Emily.

Gigascience ; 7(7), 2018 Jul 01.

Article in English | MEDLINE | ID: mdl-29790950

ABSTRACT

Background: The Health Informatics Centre at the University of Dundee provides a service to securely host clinical datasets and extract relevant data for anonymized cohorts for researchers, enabling them to answer key research questions. As is common in research using routine healthcare data, the service was historically delivered using ad-hoc processes resulting in the slow provision of data whose provenance was often hidden to the researchers using it. This paper describes the development and evaluation of the Research Data Management Platform (RDMP): an open source tool to load, manage, clean, and curate longitudinal healthcare data for research and provide reproducible and updateable datasets for defined cohorts to researchers. Results: Between 2013 and 2017, RDMP tool implementation tripled the productivity of data analysts producing data releases for researchers from 7.1 to 25.3 per month and reduced the error rate from 12.7% to 3.1%. The effort on data management reduced from a mean of 24.6 to 3.0 hours per data release. The waiting time for researchers to receive data after agreeing a specification reduced from approximately 6 months to less than 1 week. The software is scalable and currently manages 163 datasets. A total of 1,321 data extracts for research have been produced, with the largest extract linking data from 70 different datasets. Conclusions: The tools and processes that encompass the RDMP not only fulfil the research data management requirements of researchers but also support the seamless collaboration of data cleaning, data transformation, data summarization and data quality assessment activities by different research groups.

Subjects

Computer Systems , Longitudinal Studies , Medical Informatics/methods , Databases, Factual , Humans , Internet , Programming Languages , Quality Control , Reproducibility of Results , Research , Scotland , Software , Universities

Mapping Local Codes to Read Codes.

Bonney, Wilfred; Galloway, James; Hall, Christopher; Ghattas, Mikhail; Tramma, Leandro; Nind, Thomas; Donnelly, Louise; Jefferson, Emily; Doney, Alexander.

Stud Health Technol Inform ; 234: 29-36, 2017.

Article in English | MEDLINE | ID: mdl-28186011

ABSTRACT

Background & Objectives: Legacy laboratory test codes make it difficult to use clinical datasets for meaningful translational research, where populations are followed for disease risk and outcomes over many years. The Health Informatics Centre (HIC) at the University of Dundee hosts continuous biochemistry data from the clinical laboratories in Tayside and Fife dating back as far as 1987. However, the HIC-managed biochemistry dataset is coupled with incoherent sample types and unstandardised legacy local test codes, which increases the complexity of using the dataset for reasonable population health outcomes. The objective of this study was to map the legacy local test codes to the Scottish 5-byte Version 2 Read Codes using biochemistry data extracted from the repository of the Scottish Care Information (SCI) Store. METHODS: Data mapping methodology was used to map legacy local test codes from clinical biochemistry laboratories within Tayside and Fife to the Scottish 5-byte Version 2 Read Codes. RESULTS: The methodology resulted in the mapping of 485 legacy laboratory test codes, spanning 25 years, to 124 Read Codes. CONCLUSION: The data mapping methodology not only facilitated the restructuring of the HIC-managed biochemistry dataset to support easier cohort identification and selection, but it also made it easier for the standardised local laboratory test codes, in the Scottish 5-byte Version 2 Read Codes, to be mapped to other health data standards such as Clinical Terms Version 3 (CTV3); LOINC; and SNOMED CT.

Subjects

Clinical Laboratory Information Systems , Systems Integration , Data Accuracy , Data Curation , Humans , Scotland
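
The many-to-one mapping described in the abstract above (485 legacy local test codes mapped to 124 Read Codes) can be pictured as a lookup table applied to each laboratory record. The sketch below is a minimal illustration only: the code values are placeholders, not real local codes or Read Codes, and the function and field names are assumptions rather than anything from the HIC implementation.

```python
from collections import Counter

# Placeholder mapping: several legacy local codes may share one target code.
# These strings are invented for illustration and are not real Read Codes.
LOCAL_TO_READ = {
    "LOCAL_NA_1": "READ_X1",
    "LOCAL_NA_2": "READ_X1",
    "LOCAL_K_1": "READ_X2",
}


def map_local_codes(records, mapping):
    """Attach a standard code to each record; tally codes with no mapping.

    Each record is a dict with a 'local_code' key (hypothetical field name).
    Returns (mapped_records, unmapped_code_counts).
    """
    mapped = []
    unmapped = Counter()
    for record in records:
        # Normalise whitespace and case before lookup.
        local = record["local_code"].strip().upper()
        read_code = mapping.get(local)
        if read_code is None:
            unmapped[local] += 1  # keep for manual review, never guess
            continue
        mapped.append({**record, "read_code": read_code})
    return mapped, unmapped


results = [
    {"local_code": " local_na_1 ", "value": 140},
    {"local_code": "LOCAL_NA_2", "value": 138},
    {"local_code": "ZZZ", "value": 1},
]
mapped, unmapped = map_local_codes(results, LOCAL_TO_READ)
print(len(mapped), dict(unmapped))
```

Keeping unmapped codes in a separate tally, rather than dropping them silently, makes it easier to see which legacy codes still need manual curation.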
